HTML API: Handle \f in skip_script_data tag matching #9402
'>' !== $c &&
' ' !== $c &&
"\n" !== $c &&
'/' !== $c &&
"\t" !== $c &&
"\f" !== $c &&
"\r" !== $c
I ordered these by how often I'd expect the character to be seen in this position. I don't expect any real performance improvements from that part of the change, but also don't see any downside to having '>' and ' ' (space) appear as the first and second match opportunities.
my understanding is that these are all likely to be performed in parallel and executed before the CPU even reaches these lines, so yeah, I would guess this is true, but without measurement also lean on not knowing. shouldn’t matter in any case, and unless someone has realistic benchmarks on realistic data, I would be skeptical of any performance claims on the position of these items.
@@ -2012,6 +2012,7 @@ public function test_script_tag_parsing( string $input, bool $closes ) {
     public static function data_script_tag(): array {
         return array(
             'Basic script tag' => array( '<script></script>', true ),
+            'Basic script tag with </script\f> close' => array( "<script></script\f>", true ),
given that we’re testing a class of terminations here we could extend these to add all of the relevant characters. for now I think this patch is great to go anyway, but I do think we would have some valuable work for someone to refactor some of the existing tests from the original build of the Tag Processor.
there’s probably something to be said about recreating the state machine from the spec and testing each of its branches.
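The suggestion above, covering the whole class of terminating characters rather than only `\f`, could look roughly like the following data provider. This is a sketch, not code from the patch: the function name `data_script_closer_terminators` and the case labels are hypothetical, the character list is taken from the condition under review, and the expected value `true` assumes each of those characters ends the tag name so that the closer is recognized.

```php
<?php
/**
 * Hypothetical data provider: one SCRIPT closer per tag-name terminator.
 *
 * The terminator list mirrors the condition in the patch. The final ">" is
 * already exercised by the existing 'Basic script tag' case.
 *
 * @return array[] Test cases keyed by label: array( $input, $closes ).
 */
function data_script_closer_terminators(): array {
	$terminators = array(
		'space'           => ' ',
		'tab'             => "\t",
		'line feed'       => "\n",
		'form feed'       => "\f",
		'carriage return' => "\r",
		'solidus'         => '/',
	);

	$cases = array();
	foreach ( $terminators as $name => $char ) {
		// Assumption: every listed character terminates the tag name.
		$cases[ "Script closer followed by {$name}" ] = array( "<script></script{$char}>", true );
	}

	return $cases;
}
```

Generating the cases from a single list keeps the tests and the set of terminators in step if the list ever changes.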
It baffles me how we missed this earlier on, since it is so clear in the spec.
Before merging, would you be interested in two changes:
- Move the comment above the if structure so it’s not awkwardly inside the condition? If you prefer it stay I trust your thoughts, but I think it could make this condition stand out more cleanly if we had an empty line above and below it and the comment explaining this.
- Add a second * to make it a PHPDoc comment so that the @see links integrate with IDEs more smoothly, and remove the leading - so that they stand on their own. Otherwise, if we keep them as list items, {@see https://…} might be more appropriate.
The comment is verbose and I don’t think it needs as much explanation; even a single link to the SCRIPT parsing section would suffice. That said, being verbose is a decent default for comments.
Trac ticket: https://core.trac.wordpress.org/ticket/63738
Address a minor HTML API mis-parse of script contents, where \f is not recognized as a valid trailing/termination character of SCRIPT tag names.
In this case, the \f form feed should be recognized as the end of the script tag name and close the script:
before / after
This is another example of the same issue. Again, \f form feed should be recognized as the end of the script tag name and enter the double-escaped state (this script tag is not closed and consumes the rest of the document):
In this case <script< is correctly recognized as not a sequence that should transition from escaped to double-escaped, however it incorrectly advances beyond the following < character that starts the script close tag and does not close correctly at </script>.
before / after
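The termination rule described above can be sketched as a pair of standalone helpers. This is an illustration only, not how `skip_script_data` is written: `is_tag_name_terminator` and `is_script_closer_at` are hypothetical names, and the character list is copied from the condition in the patch. The sketch checks only that `</script` is followed by a terminating character. It does not track the escaped or double-escaped states.

```php
<?php
/**
 * Hypothetical helper: does this byte end a tag name?
 * The list mirrors the characters checked in the patched condition.
 */
function is_tag_name_terminator( string $c ): bool {
	return (
		'>' === $c ||
		' ' === $c ||
		"\n" === $c ||
		'/' === $c ||
		"\t" === $c ||
		"\f" === $c ||
		"\r" === $c
	);
}

/**
 * Hypothetical helper: is there a SCRIPT closer starting at byte offset $at?
 * Assumes $at is a non-negative offset into $html.
 */
function is_script_closer_at( string $html, int $at ): bool {
	$name_end = $at + 8; // 8 is the length of '</script'.

	// There must be at least one byte after the tag name.
	if ( $name_end >= strlen( $html ) ) {
		return false;
	}

	// Tag names are matched ASCII case-insensitively.
	if ( 0 !== substr_compare( $html, '</script', $at, 8, true ) ) {
		return false;
	}

	// Without a terminator, this is a longer name such as "scripts".
	return is_tag_name_terminator( $html[ $name_end ] );
}
```

With these helpers, `</script\f>` is treated as a closer in the same way as `</script>` or `</script >`, while `</scripts>` is not.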
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.